Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration

Authors

  • Anirban Nag
  • Ali Shafiee
  • Rajeev Balasubramonian
  • Vivek Srikumar
  • Naveen Muralimanohar
Abstract

Many recent works have designed accelerators for Convolutional Neural Networks (CNNs). While digital accelerators have relied on near-data processing, analog accelerators have further reduced data movement by performing in-situ computation. Recent works take advantage of highly parallel analog in-situ computation in memristor crossbars to accelerate the many vector-matrix multiplication operations in CNNs. However, these in-situ accelerators have two significant shortcomings that we address in this work. First, the analog-to-digital converters (ADCs) account for a large fraction of chip power and area. Second, these accelerators adopt a homogeneous design in which every resource is provisioned for the worst case. By addressing both problems, the new architecture, Newton, moves closer to achieving optimal energy-per-neuron for crossbar accelerators. We introduce multiple new techniques that apply at different levels of the tile hierarchy. Two of the techniques leverage heterogeneity: one adapts ADC precision to the requirements of each sub-computation (with zero impact on accuracy), and the other designs tiles customized for convolutions or classifiers. Two other techniques rely on divide-and-conquer numeric algorithms to reduce computations and ADC pressure. Finally, we place constraints on how a workload is mapped to tiles, which helps reduce resource provisioning in tiles. For a wide range of CNN dataflows and structures, Newton achieves a 77% decrease in power, a 51% improvement in energy efficiency, and 2.2× higher throughput/area relative to the state-of-the-art ISAAC accelerator.
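To make the ADC argument concrete, the following is a minimal sketch of a bit-serial, bit-sliced vector-matrix multiplication of the kind crossbar accelerators perform. It is an illustration under stated assumptions, not the Newton or ISAAC design: the function name `crossbar_vmm`, the 2-bit cells, the 8-bit unsigned operands, and the 128-row crossbar are all hypothetical choices, and the analog column current is modeled as an exact integer sum. The sketch reports how many ADC bits the observed column sums need, which can be compared with the worst-case provisioning.

```python
import random


def crossbar_vmm(x, W, in_bits=8, w_bits=8, cell_bits=2):
    """Compute x @ W the way a bit-sliced crossbar would (unsigned integers only).

    Hypothetical model: each weight is split into cell_bits-wide slices,
    inputs are streamed one bit per cycle, and every column sum is
    "digitized" before a digital shift-and-add step.
    Assumes w_bits is a multiple of cell_bits.
    """
    rows, cols = len(W), len(W[0])
    cell_mask = (1 << cell_bits) - 1
    result = [0] * cols
    max_column_sum = 0
    for s in range(0, w_bits, cell_bits):      # bit offset of the weight slice
        for b in range(in_bits):               # bit position of the input
            for c in range(cols):
                # Stand-in for the analog current summed along one column.
                column_sum = sum(
                    ((x[r] >> b) & 1) * ((W[r][c] >> s) & cell_mask)
                    for r in range(rows)
                )
                max_column_sum = max(max_column_sum, column_sum)
                # Digital shift-and-add of the digitized partial result.
                result[c] += column_sum << (s + b)
    # Bits an ADC needs to represent the largest column sum actually seen.
    adc_bits_needed = max(1, max_column_sum.bit_length())
    return result, adc_bits_needed


# Usage: a 128-row, 4-column crossbar with random 8-bit inputs and weights.
rng = random.Random(0)
rows, cols = 128, 4
x = [rng.randrange(256) for _ in range(rows)]
W = [[rng.randrange(256) for _ in range(cols)] for _ in range(rows)]

out, adc_bits_needed = crossbar_vmm(x, W)
exact = [sum(x[r] * W[r][c] for r in range(rows)) for c in range(cols)]
# Worst case: every input bit is 1 and every 2-bit cell holds its maximum value.
worst_case_bits = (rows * 3).bit_length()
print(out == exact, adc_bits_needed, worst_case_bits)
```

In this toy model, a homogeneous design would size every ADC for `worst_case_bits`, while the sums that actually occur may need fewer bits. That gap is the kind of slack a precision-adaptive scheme could exploit, although the abstract does not specify how Newton does so.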

Similar resources

Local stability criterion for self-gravitating disks in modified gravity

We study local stability of self-gravitating fluid and stellar disk in the context of modified gravity theories which predict a Yukawa-like term in the gravitational potential of a point mass. We investigate the effect of such a Yukawa-like term on the dynamics of self-gravitating disks. More specifically, we investigate the consequences of the presence of this term for the local stability of t...

Stability Assessment of the Flexible System using Redundancy

In this study, the dynamic behavior of a mooring line in a floating system is analyzed by probabilistic approaches. In dynamics, most research has represented the system model and environment by mathematical expressions. We call this process forward dynamics. However, there is a limit to defining the exact environment because of uncertainty. To consider uncertainty, we introduce the redundancy i...

A quasi-Newton acceleration for high-dimensional optimization algorithms

In many statistical problems, maximum likelihood estimation by an EM or MM algorithm suffers from excruciatingly slow convergence. This tendency limits the application of these algorithms to modern high-dimensional problems in data mining, genomics, and imaging. Unfortunately, most existing acceleration techniques are ill-suited to complicated models involving large numbers of parameters. The s...

A new method for 3-D magnetic data inversion with physical bound

Inversion of magnetic data is an important step towards interpretation of the practical data. Smooth inversion is a common technique for the inversion of data. Physical bound constraint can improve the solution to the magnetic inverse problem. However, how to introduce the bound constraint into the inversion procedure is important. Imposing bound constraint makes the magnetic data inversion a n...

Weighted Arbitration Algorithms with Priorities for Input-Queued Switches with 100% Throughput

Input buffered switches have the strong advantage of efficient crossbar usage. Virtual Output Queueing (VOQ) has to be established to circumvent the head-of-line (HOL) blocking which limits the throughput to 58.6%. Arbitration algorithms control the access to the switch fabric in each time slot. Weighted algorithms achieve 100% throughput with lowest delays under all admissible traffic even und...

Publication date: 2018